1.  Introduction.

By performing experiments at the levels $x_1, x_2, \ldots$, we obtain
observations $y_i = \alpha + \beta x_i + \Delta_i$ $(i = 1, 2, \ldots)$, the
$\Delta$'s being independent $N(0, \sigma^2)$ random variables, and $\alpha$
and $\beta$ the unknown linear regression coefficients. The x's are at the
disposal of the experimenter, subject to the restriction of being within a
given range, say $x' \le x \le x''$. It is desired to design an experiment,
i.e., to specify the levels at which the experiment is to be performed, to
estimate that x, say $\theta$, for which $E(y)$ equals some specified value;
we choose this value to be zero without loss of generality. Then, we are to
estimate $\theta = -\alpha/\beta$. We shall assume that $x' < \theta < x''$;
moreover, we transform the x-axis so that $x' = -1$, $x'' = +1$.
(See the diagrams, pg. 13.)

We propose $t_N = -a_N/b_N$ as an estimate of $\theta$, $a_N$ and $b_N$ being
the least squares estimates of $\alpha$ and $\beta$, respectively, based on N
observations, assuming the x's are determined before experimentation begins.
In this case, $t_N$ is the maximum likelihood estimate of $\theta$ since $a_N$
and $b_N$ are maximum likelihood estimates. If the x's are chosen sequentially
so that they are random variables, we use the estimate $t_N$, computing $a_N$
and $b_N$ from the same formulas. From regression theory, we have

(Sometimes we shall omit the subscripts on $a_N$, $b_N$, and $t_N$ without
fear of confusion.)
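
As an illustration, the estimate $t = -a/b$ can be computed from the usual least squares formulas for a straight line. The sketch below assumes those textbook formulas; the function names and the noise-free data are hypothetical and serve only to show the arithmetic.

```python
def least_squares_line(xs, ys):
    """Return (a, b): least squares intercept and slope for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("at least two distinct levels are needed")
    b = sum((x - x_bar) * y for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    return a, b


def estimate_root(xs, ys):
    """Return t = -a/b, the estimated level at which E(y) = 0."""
    a, b = least_squares_line(xs, ys)
    if b == 0:
        raise ZeroDivisionError("fitted slope is zero; t is undefined")
    return -a / b


# Noise-free illustration (hypothetical data): y = 1 + 4x crosses zero
# at x = -1/4, so the estimate should recover -0.25.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1.0 + 4.0 * x for x in levels]
t_hat = estimate_root(levels, responses)
print(t_hat)
```

With noisy observations the same call applies unchanged; only the interpretation of $t$ as an estimate, rather than the exact root, differs.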

We shall give designs of experiments for estimating $\theta$ based on certain
"optimum" criteria in both the non-sequential and sequential cases.
Properties of both the non-sequential and sequential estimates $t_N$ are
discussed, including, in the non-sequential case, an approximation to the
distribution of $t_N$ and a confidence interval for $\theta$. Finally, some
examples have been constructed to illustrate the character of the designs.

2.  Non-sequential estimation.  We assume the x's to be fixed by the
experimenter in advance of the experimentation. From regression theory, we
know that $(a, b)$ has a bivariate normal distribution with means $\alpha$ and
$\beta$, variances $\sigma_a^2$ and $\sigma_b^2$, and covariance
$\sigma_{ab}$, where

$$\sigma_a^2 = \frac{\sigma^2\,\Sigma x^2}{N\,\Sigma(x-\bar{x})^2}, \qquad
\sigma_b^2 = \frac{\sigma^2}{\Sigma(x-\bar{x})^2}, \qquad
\sigma_{ab} = -\frac{\sigma^2\,\bar{x}}{\Sigma(x-\bar{x})^2}.$$

For later reference, if we replace $\sigma^2$ above by $s^2$, an estimate of
$\sigma^2$ on $N-2$ d.f., we obtain the estimates $s_a^2$, $s_b^2$, and
$s_{ab}$, respectively, which are independent of a and b.

It may be shown that the ratio of two normally distributed variables
has no finite moments (except in special cases); hence, $t = -a/b$ has no
finite moments. This would not be so if b could not take on values in a
small interval about zero, and if the coefficient of variation of b, $v_b$,
is sufficiently small, such values would not occur in practice. Therefore,
we may suppose that if we give expansions in powers of $v_b$ for the mean and
variance of t, the first few terms in the expansions will give, in
practice, reasonable measures of location and scale, respectively. The
symbols for mean and variance of t used below are to be understood in this
light.

Using the series expansion for the expectation of the ratio of two
normally distributed variables given by Rao [1, pp. 13'3-4], we have

where $v_b = \sigma_b/\beta = \sigma/\beta\sqrt{\Sigma(x-\bar{x})^2}$. By a
development similar to that in Rao, we obtain

(Further justification may be given for these expansions: (1) If we
assume b to have a truncated (at $b = 0$) normal distribution, the variance of
t is finite and the first two terms in the expansion (4) may be shown to
be correct up to the proper order [2, pp. 353-4, 358]. (2) If we expand t
as a function of $\Delta = (\Delta_1, \ldots, \Delta_N)$ in a Maclaurin series
and take expectation and variance term-wise, we obtain the same first two
terms as given in (3) and (4), even if the $\Delta$'s are not assumed normally
distributed.)
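
The leading terms of such expansions can be sketched by a second-order Taylor argument about $(\alpha, \beta)$. What follows is a reconstruction under the stated model, using the standard covariance formulas for least squares estimates and the abbreviation $S = \Sigma(x-\bar{x})^2$ (introduced here for brevity); it is not necessarily the form in which (3) and (4) are written.

```latex
\[
\begin{aligned}
t &= -a/b, \qquad
  \sigma_a^2 = \sigma^2\Big(\tfrac{1}{N} + \tfrac{\bar{x}^2}{S}\Big),\quad
  \sigma_b^2 = \tfrac{\sigma^2}{S},\quad
  \sigma_{ab} = -\tfrac{\sigma^2\bar{x}}{S},\quad
  v_b^2 = \tfrac{\sigma^2}{\beta^2 S}, \\
E\,t &\approx \theta + \frac{\sigma_{ab}}{\beta^2} + \frac{\theta\,\sigma_b^2}{\beta^2}
      = \theta - (\bar{x}-\theta)\,v_b^2, \\
\mathrm{Var}\,t &\approx \frac{\sigma_a^2 + 2\theta\,\sigma_{ab} + \theta^2\sigma_b^2}{\beta^2}
      = \frac{\sigma^2}{N\beta^2} + (\bar{x}-\theta)^2\,v_b^2 .
\end{aligned}
\]
```

Both correction terms carry the factor $(\bar{x}-\theta)$ and the factor $1/S$, which is why the bias and the variance shrink together when $\bar{x}$ is near $\theta$ and $\Sigma(x-\bar{x})^2$ is large.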

We can reduce the bias of t and the variance of t, as given by (3)
and (4), simultaneously by choosing the x's so that $\bar{x} - \theta$ is
small and $\Sigma(x-\bar{x})^2$ is large. We shall choose the x's so that
$\Sigma(x-\bar{x})^2$ will be maximized subject to $\bar{x}$ being fixed close
to $\theta$; this is accomplished by performing all experiments at one of the
two extreme levels, n at $x'$ $(= -1)$ and $N-n$ at $x''$ $(= +1)$, where n is
chosen so as to make the following approximation as close as possible:
$\bar{x} = (N-2n)/N \approx \theta$; i.e., n is the closest integer to
$\tfrac{1}{2}N(1-\theta)$. These will be termed the optimum criteria in the
non-sequential case. Thus a good design will require some a priori knowledge
of $\theta$; if none is available, it would appear reasonable to retain the
two-level design with $n \approx N/2$.
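
The allocation rule above can be sketched as follows. The function name and the guard that keeps at least one observation at each level are assumptions added for illustration; the text itself only specifies that n is the integer closest to $\tfrac{1}{2}N(1-\theta)$.

```python
import math


def two_level_allocation(N, theta_guess):
    """Return (n, N - n): number of runs at x = -1 and at x = +1.

    n is the integer closest to N * (1 - theta_guess) / 2, so that the mean
    level (N - 2n) / N is as close as possible to theta_guess.
    """
    if not -1.0 <= theta_guess <= 1.0:
        raise ValueError("theta_guess must lie in [-1, +1]")
    n = math.floor(N * (1.0 - theta_guess) / 2.0 + 0.5)
    # Added safeguard (not discussed in the text): keep at least one run at
    # each level so that the slope remains estimable.
    n = min(max(n, 1), N - 1)
    return n, N - n


# Hypothetical prior guess theta = -1/4 with N = 10 runs.
n_low, n_high = two_level_allocation(10, -0.25)
mean_level = (n_high - n_low) / 10  # equals (N - 2n) / N
print(n_low, n_high, mean_level)
```

With no prior knowledge one would call the function with `theta_guess = 0`, which gives the balanced design with n = N/2 for even N.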

(As further justification for the optimum criteria, we note that:
(1) Maximizing the term corresponding to $\theta$ in the inverse of the
information matrix for $(\beta, \theta)$ leads to the same criteria.
(2) According to Fieller (see Section 5), we may obtain a confidence
interval for $\theta$ with length

where $t_0$ is a constant. If we replace $(a, b, s^2)$ by
$(\alpha, \beta, \sigma^2)$, minimization of the length of the confidence
interval is essentially accomplished by the above criteria.)

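If n observations are taken at $x = -1$ and $N-n$ at $x = +1$, it may be
shown that

$$a_N = \tfrac{1}{2}(\bar{y}+\bar{z}), \qquad b_N = -\tfrac{1}{2}(\bar{y}-\bar{z}), \qquad t_N = (\bar{y}+\bar{z})/(\bar{y}-\bar{z}),$$

where here $\bar{y}$ denotes the average of the observations at $x = -1$ and
$\bar{z}$ the average of those at $x = +1$.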
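
These closed forms are easy to check numerically. The sketch below uses made-up responses and a hypothetical function name; it only restates the three formulas for a two-level design.

```python
def two_level_estimates(low_obs, high_obs):
    """Closed-form a, b, t for runs at x = -1 (low_obs) and x = +1 (high_obs)."""
    y = sum(low_obs) / len(low_obs)    # average response at x = -1
    z = sum(high_obs) / len(high_obs)  # average response at x = +1
    if y == z:
        raise ZeroDivisionError("equal group means; t is undefined")
    a = 0.5 * (y + z)
    b = -0.5 * (y - z)
    t = (y + z) / (y - z)
    return a, b, t


# Made-up responses, for illustration only.
a, b, t = two_level_estimates([-3.2, -2.8, -3.0], [5.1, 4.9])
print(a, b, t)
```

By construction $t$ equals $-a/b$, so the fitted line through the two group means crosses zero at $t$.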


3.  Properties of non-sequential estimates.  For designs in which
$\bar{x} - \theta$ is very small and/or $\Sigma(x-\bar{x})^2$ is large (in
particular, for optimum non-sequential designs),
$\mathrm{Var}\ t \approx \sigma^2/(N\beta^2)$. This may be estimated by
$s^2/(Nb^2)$. We find that $s^2/(Nb^2)$ has no finite moments, but by
arguments similar to those above, we obtain

Hence, $s^2/(Nb^2)$ is a consistent estimate of $\sigma^2/(N\beta^2)$ if
$\Sigma(x-\bar{x})^2$ tends to infinity with N; moreover, it is a conservative
estimate in that the bias is positive.
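
A formal expansion of the same kind indicates where the positive bias comes from. This is a sketch under the normal-theory assumptions (so that $s^2$ is independent of b and $E\,s^2 = \sigma^2$) and is understood in the same formal sense as the earlier expansions; it is not necessarily the expression given in the paper.

```latex
\[
E\!\left[\frac{s^2}{N b^2}\right]
  = \frac{E\,s^2}{N}\; E\!\left[\frac{1}{b^2}\right]
  \approx \frac{\sigma^2}{N\beta^2}\,\bigl(1 + 3 v_b^2\bigr),
\qquad
v_b^2 = \frac{\sigma^2}{\beta^2\,\Sigma(x-\bar{x})^2}.
\]
```

The correction $3v_b^2$ is positive and tends to zero as $\Sigma(x-\bar{x})^2$ grows, in agreement with the two properties stated above.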

$t_N$, being the maximum likelihood estimate of $\theta$, has all the
well-known properties of such estimates, in particular, consistency and
asymptotic normality, assuming $\Sigma(x-\bar{x})^2$ tends to infinity with N.
The asymptotic variance is $\sigma^2/(N\beta^2)$.

4.  The distribution of the non-sequential estimate.  Geary [3] gives
an approximation to the distribution of the ratio of two normal variables,
the error being small if the coefficient of variation of the denominator
variable is small. Applying his work, with some refinement, to the variable
$t = -a/b$, we have, denoting the c.d.f. of $t_N$ by $F_N$,

$$F_N(t) = \Phi[u(t)] + R(t)$$

where $\Phi$ is the standard normal c.d.f.,

Now $R(t)$ is a monotone increasing function of t and varies from
$-\Phi(-1/v_b)$ to $+\Phi(-1/v_b)$. Hence, for small $v_b$,

In particular, if $v_b < 0.430$, $|R(t)| < 0.01$.
For two-level designs, $\Sigma(x-\bar{x})^2 = 4n(N-n)/N$, so that

$$v_b = \frac{\sigma}{2\beta}\sqrt{\frac{N}{n(N-n)}},$$

and the normal approximation is valid within 1% if $v_b < 0.430$ or
$n(N-n)/N > 1.353\,\sigma^2/\beta^2$.
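
The two thresholds can be checked numerically. The sketch below assumes a two-level design, for which $\Sigma(x-\bar{x})^2 = 4n(N-n)/N$, and takes $|\beta|$ so that $v_b$ is positive; the function names and the parameter values are hypothetical.

```python
import math


def std_normal_cdf(u):
    """Standard normal c.d.f., written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def two_level_cv(N, n, sigma, beta):
    """v_b = sigma_b / |beta| for n runs at -1 and N - n runs at +1."""
    sxx = 4.0 * n * (N - n) / N  # sum of squares of x about its mean
    return sigma / (abs(beta) * math.sqrt(sxx))


def remainder_bound(v_b):
    """Phi(-1/v_b): the extreme magnitude of R(t) quoted in the text."""
    return std_normal_cdf(-1.0 / v_b)


# Hypothetical values: sigma = 1, beta = 4, balanced design with N = 10.
v = two_level_cv(N=10, n=5, sigma=1.0, beta=4.0)
print(v, remainder_bound(v))
print(remainder_bound(0.430))  # close to the 1% threshold
```

Since $1/(4 \times 0.430^2) \approx 1.352$, the condition $v_b < 0.430$ translates into the stated bound on $n(N-n)/N$ up to rounding.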


5.  Confidence interval for $\theta$ (non-sequential case).  Fieller [4]
develops exact confidence intervals for the ratio of two normal variables
based on the Student-Fisher t-distribution. He reasons as follows:
Since $a + \theta b$ is normal (with mean $\alpha + \theta\beta = 0$) and
$s_a^2 + 2\theta s_{ab} + \theta^2 s_b^2$ is an independent estimate of
its variance on $N-2$ d.f., it follows that

$$z = \frac{a + \theta b}{\sqrt{s_a^2 + 2\theta s_{ab} + \theta^2 s_b^2}}$$

has a t-distribution with $N-2$ d.f. Therefore, for given $\delta$, if $t_0$
is chosen so that $\Pr(|z| < t_0) = \delta$, we have

$$\delta = \Pr(z^2 < t_0^2) = \Pr[(a+\theta b)^2 < t_0^2(s_a^2 + 2\theta s_{ab} + \theta^2 s_b^2)]$$

$$\quad = \Pr[(a^2 - t_0^2 s_a^2) + 2\theta(ab - t_0^2 s_{ab}) + \theta^2(b^2 - t_0^2 s_b^2) < 0].$$

Fieller shows that if b is sufficiently different from zero (specifically,
if $b^2/s_b^2 > t_0^2$, which is likely if $v_b$ is small), then we have a
confidence interval for $\theta$:

$$\frac{-(ab - t_0^2 s_{ab}) - t_0\sqrt{c}}{b^2 - t_0^2 s_b^2} \;\le\; \theta \;\le\; \frac{-(ab - t_0^2 s_{ab}) + t_0\sqrt{c}}{b^2 - t_0^2 s_b^2}$$

where $c = a^2 s_b^2 - 2ab\,s_{ab} + b^2 s_a^2 - t_0^2(s_a^2 s_b^2 - s_{ab}^2)$.
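
The interval follows from solving the quadratic inequality in $\theta$ given above. The sketch below is one way to carry out that computation; the function name is hypothetical, the inputs in the example are made up, and $t_0 = 2.306$ is the two-sided 95% point of Student's t on 8 d.f.

```python
import math


def fieller_interval(a, b, s_a2, s_ab, s_b2, t0):
    """Limits for theta from Fieller's quadratic inequality

        (a^2 - t0^2 s_a2) + 2 theta (a b - t0^2 s_ab)
            + theta^2 (b^2 - t0^2 s_b2) < 0.

    s_a2, s_b2, s_ab are the estimated variances and covariance of (a, b).
    """
    denom = b * b - t0 * t0 * s_b2
    if denom <= 0:
        raise ValueError("b is not sufficiently different from zero")
    # c is the discriminant of the quadratic divided by 4 * t0^2.
    c = (a * a * s_b2 - 2.0 * a * b * s_ab + b * b * s_a2
         - t0 * t0 * (s_a2 * s_b2 - s_ab ** 2))
    half_width = t0 * math.sqrt(max(c, 0.0))  # max() guards against rounding
    centre = -(a * b - t0 * t0 * s_ab)
    return (centre - half_width) / denom, (centre + half_width) / denom


# Made-up inputs: a = 1, b = 4 from a balanced two-level design with N = 10
# and s^2 = 1, so that s_a^2 = s_b^2 = 0.1 and s_ab = 0.
lower, upper = fieller_interval(1.0, 4.0, 0.1, 0.0, 0.1, 2.306)
print(lower, upper)
```

The point estimate $t = -a/b$ always lies inside the interval, but the interval is not in general symmetric about it.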

6.  Sequential estimation.  We now suppose the experiments to be
performed sequentially, the level of each experiment being determined on
the basis of the previous observations. Specifically, we suppose
observations to be taken in groups of k (a positive integer), the levels in
the (m+1)th group being determined on the basis of the observations in the
first m groups: thus

(5)

say. Hence, the y's are no longer independent, and the previous results do
not necessarily hold. We propose the same estimate, $t_N = -a_N/b_N$, where
$a_N$ and $b_N$ are defined by (1) and (2) though now they may not be normally
distributed.

Expanding $t_N$ as a function of $\Delta = (\Delta_1, \ldots, \Delta_N)$, from
equations (1), (2), and (5), in a Maclaurin series, we obtain:

Assuming that we may take expectation and variance term-wise (though, as in
the non-sequential case, they may not exist), we have

where $\xi = \bar{x}\,|_{\Delta=0}$.

Consider a sequential plan such that $\bar{x}_N = t_r$ for some $r < N$; we
call all such plans in which observations are taken in groups of k "designs
of type $D_k$". Then $\xi = \bar{x}_N|_{\Delta=0} = t_r|_{\Delta=0} = \theta$,
and $\mathrm{Var}\ t_N = \sigma^2/(N\beta^2) + \ldots$. By evaluating the
second term in equation (6), we find it to be zero for designs of type $D_k$.
Hence, for such designs, the bias and the variance, as given by (6) and (7),
are simultaneously reduced. By evaluation of some of the higher order terms,
it may be shown that $\Sigma(x-\bar{x})^2$ always appears with a negative
exponent. Hence, as in the non-sequential case, a design in which all but
the last group of observations are taken at $x = \pm 1$, and the last group
is so allocated that $\bar{x}_N = t_{N-k}$, is optimum in the above sense.
We call such a design a "truncated two-level design". Thus, for $k = 1$, the
first $N-1$ observations are to be taken at $-1$ and $+1$, keeping the
average close to the previous estimate of $\theta$ so that the last
observation may be taken at some x between $-1$ and $+1$ in such a way that
$\bar{x}_N = t_{N-1}$. Explicitly, the sequential design is:


(8)
$$x_1 = -1, \qquad x_2 = +1,$$

$$x_{m+1} = \mathrm{sgn}\!\left(t_m - \frac{m-2n}{m}\right), \qquad 1 < m < N-2$$

(after the mth stage, we suppose n observations have been taken at $-1$ and
$m-n$ at $+1$),

$$x_{N-1} = \min\!\left(N t_{N-2} - \sum_{i=1}^{N-2} x_i + 1,\; +1\right),$$

$$x_N = N t_{N-1} - \sum_{i=1}^{N-1} x_i,$$

where $\mathrm{sgn}(u) = +1$ if $u > 0$, $-1$ if $u < 0$.
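
A small simulation may help in following the steps of the truncated two-level design for $k = 1$. This is a sketch under several assumptions that are not in the text: the callback `respond` is hypothetical, ties in the sign rule go to $-1$, and levels are clipped to $[-1, +1]$ as a safeguard when the estimate falls far outside the range.

```python
import random


def root_estimate(xs, ys):
    """t = -a/b from the least squares line fitted to (xs, ys)."""
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * y for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    return -a / b  # fails only if the fitted slope is exactly zero


def clip(u):
    """Force a level into the admissible range [-1, +1] (added safeguard)."""
    return max(-1.0, min(1.0, u))


def truncated_two_level(N, respond):
    """Levels, observations and final estimate for the k = 1 design.

    respond(x) is a hypothetical callback returning one observation at x.
    """
    if N < 4:
        raise ValueError("N must be at least 4")
    xs = [-1.0, 1.0]
    ys = [respond(x) for x in xs]

    def observe(x):
        xs.append(x)
        ys.append(respond(x))

    # Observations 3, ..., N-2: extreme levels, steering the mean level
    # toward the current estimate t_m.
    while len(xs) < N - 2:
        t_m = root_estimate(xs, ys)
        x_bar = sum(xs) / len(xs)
        observe(1.0 if t_m > x_bar else -1.0)  # ties go to -1 (assumption)

    # Observation N-1: leave room for the last level to fall inside the range.
    t = root_estimate(xs, ys)
    observe(clip(min(N * t - sum(xs) + 1.0, 1.0)))

    # Observation N: make the overall mean level equal t_{N-1} when feasible.
    t = root_estimate(xs, ys)
    observe(clip(N * t - sum(xs)))

    return xs, ys, root_estimate(xs, ys)


# Simulated run with alpha = 1, beta = 4, sigma = 1 (seeded for repeatability).
rng = random.Random(0)
xs, ys, t_N = truncated_two_level(
    20, lambda x: 1.0 + 4.0 * x + rng.gauss(0.0, 1.0)
)
print(xs[-2:], t_N)
```

When no clipping occurs, the mean of the N levels equals the estimate based on the first $N-1$ observations, which is the defining property of a design of type $D_1$.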

(If we wish to add further single observations with the possibility
of terminating the experiment at any step but yet retaining a design of
type $D_1$, we can take

$$x_{n+1} = (n+1)\,t_n - n\,t_{n-1} \qquad (n \ge N).$$

Then, at any stage, we have $\bar{x}_{n+1} = t_n$. Moreover, such observations
will permit a check on the linearity of the regression line, if desired.)
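
The property claimed for the added observations can be verified in one step. The sketch assumes that the mean of the first n levels already equals $t_{n-1}$, which holds at $n = N$ for a truncated two-level design with $k = 1$.

```latex
\[
\sum_{i=1}^{n+1} x_i
  = n\,\bar{x}_n + x_{n+1}
  = n\,t_{n-1} + (n+1)\,t_n - n\,t_{n-1}
  = (n+1)\,t_n
\quad\Longrightarrow\quad
\bar{x}_{n+1} = t_n .
\]
```

By induction the relation then holds at every later stage, so the experiment may be stopped after any added observation without leaving type $D_1$.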


7.  Properties of sequential estimates.  In the two-level sequential case,
$F_N(t)$, the distribution of non-sequential t in Section 4, is the
conditional distribution of t given that n observations were taken at $-1$
and $N-n$ at $+1$.

In the sequential case, we have no assurance that $t_N$ is a maximum
likelihood estimate; however, we do prove consistency. Consider an
arbitrary sequential design in which $\Sigma(x-\bar{x})^2$ tends in
probability to infinity with N. (For two-level designs, this assumption is
that n and $N-n$ tend in probability to infinity with N.) Now $x_i$ is
independent of $\Delta_j$ for $i \le j$, so that
$E(x_i\Delta_i)(x_j\Delta_j) = E(x_i\Delta_i x_j)\,E\Delta_j = 0$ $(i < j)$;
thus

$$E\Sigma\Delta = 0, \qquad E(\Sigma\Delta)^2 = E\Sigma\Delta^2 = N\sigma^2,$$

$$E\Sigma x\Delta = 0, \qquad E(\Sigma x\Delta)^2 = E\Sigma x^2\Delta^2 \le E\Sigma\Delta^2 = N\sigma^2;$$

moreover, $|\Sigma x| \le N$ and $\Sigma x^2 \le N$. By application of
Tchebycheff's Inequality on $\tfrac{1}{N}\Sigma\Delta$ and
$\tfrac{1}{N}\Sigma x\Delta$ and Slutsky's Theorem [2, pg. 255], the
consistency of $a_N$ and $b_N$ as estimates of $\alpha$ and $\beta$ follows
from equations (1) and (2) and the above remarks, irrespective of the
distribution of the $\Delta$'s. Further application of Slutsky's Theorem
proves the consistency of $t_N$ as an estimate of $\theta$ under the stated
conditions.

8.  Examples.  Two samples have been constructed by assigning the
values $\alpha = 1$, $\beta = 4$ $(\theta = -1/4)$, $\sigma^2 = 1$,
$x' = -1$, $x'' = +1$, and taking values of the $\Delta$'s from Mahalanobis'
"Tables of a random sample from a normal population". (Sample 1 consists of
the first 20 values and Sample 2 the second 20 in Plate 1.) Four designs
were used with each sample, two non-sequential and two sequential, and
estimates were computed on the first 10 observations as well as on the
total of 20. The designs used were:

Design 1: $x_1 = x_2 = -1$, $x_3 = x_4 = -1/2$, $x_5 = x_6 = 0$,
$x_7 = x_8 = +1/2$, $x_9 = x_{10} = +1$; similarly for $x_{11}$ to $x_{20}$.

Design 2: two-level non-sequential design with $n = N/2$, $x_i = (-1)^i$.

Design 3: $x_1 = -1$, $x_2 = +1$, $x_n = n\,t_{n-1} - (n-1)\,t_{n-2}$
$(n > 2)$ (see the last paragraph in Section 6).

Design 4: "optimum" sequential design, given by (8). (The designs for
the samples of 10 were not truncated.)

Before presentation, the data were transformed by the transformation
$x^* = 1 + 4x$ so that $\alpha^* = 0$, $\beta^* = 1$, $\sigma^2 = 1$,
$x^{*\prime} = -3$, $x^{*\prime\prime} = +5$; hence $t_N^*$ is an estimate of
$\theta^* = 0$. (See the diagrams below.) The estimates of $\theta^*$
from each design and variance estimates from Designs 1 and 4 are given in
Table I below; the levels of the designs are given in Table II; confidence
intervals for $\theta$ are given in Table III.
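
The transformed constants follow directly from the assigned values. The short check below uses only $\alpha = 1$, $\beta = 4$, $\theta = -1/4$ and the range $-1 \le x \le +1$ given above.

```latex
\[
\begin{aligned}
y &= \alpha + \beta x + \Delta = 1 + 4x + \Delta = x^* + \Delta
   \;\Longrightarrow\; \alpha^* = 0,\ \beta^* = 1, \\
x^{*\prime} &= 1 + 4(-1) = -3, \qquad x^{*\prime\prime} = 1 + 4(+1) = +5, \\
\theta^* &= 1 + 4\theta = 1 + 4\left(-\tfrac{1}{4}\right) = 0 .
\end{aligned}
\]
```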

9.  Acknowledgement.  Acknowledgement is hereby made to Professor
H. Hotelling who proposed this problem and gave many helpful suggestions,
and to Professor S. N. Roy for suggestions after reading the manuscript.

DIAGRAMS

TABLE I

$t_N^*$ (Estimate of $\theta^* = 0$) and Variance Estimates
(after the transformation $x^* = 1 + 4x$)

TABLE III

Confidence Interval for $\theta$
(after the transformation $x^* = 1 + 4x$)

Design 2

  $\delta$    N       Sample 1            Sample 2
   0.95      10    -0.782, +1.027     -0.342, +1.827
             20    -0.375, +0.540     -0.335, +0.823
   0.99      10    -1.170, +1.541     -0.772, +2.518
             20    -0.536, +0.724     -0.526, +1.048